The distribution of recurrence times or return intervals between extreme events is important for
characterizing and understanding the behavior of physical systems and phenomena in many disciplines.
It is well known that many physical processes in nature and society display long range correlations.
Hence, in the last few years, considerable research effort has been directed towards studying the
distribution of return intervals for long range correlated time series. Based on numerical simulations,
it was shown that the return interval distributions are of stretched exponential type. In this paper, we
obtain an analytical expression for the distribution of return intervals in long range correlated time
series which holds when the average return intervals are large. We show that the distribution
is actually a product of a power law and a stretched exponential form. We also discuss the regimes
of validity and study in detail how the return interval distribution depends on the
threshold used to define extreme events.

I. INTRODUCTION

Extreme events take place frequently in both nature and society. For instance, floods, droughts,
earthquakes and economic recessions are all examples of extreme events. The consequences of extreme
events to life and property are often enormous and hence it is desirable to study their properties
and questions related to their predictability. Interestingly, all of these extreme events are also
non-equilibrium phenomena, and studying their extreme value statistics will lead to a better
understanding of the models and the phenomenology of non-equilibrium statistical physics. Thus,
there is an increasing interest in the physics literature in understanding a broad range of issues
and phenomena connected with the occurrence of extreme events and their dynamics [1, 2].

In classical extreme value theory, the limiting distribution for the maxima of sequences of
independent and identically distributed random variables is one of the Fréchet, Gumbel or Weibull
distributions, depending on the behavior of the tail of the probability density [3]. This has been
empirically verified in many cases of practical interest. New applications continue to be
discovered, a recent one being the distribution of extreme components of the eigenmodes of quantum
chaotic systems [4]. In contrast to questions about the distribution of extrema, one of the problems
addressed in the last few years is the distribution of the return intervals for extreme events
when the underlying time series displays long memory [5, 6, 7, 8, 9]. This is primarily motivated by
the fact that many natural and socio-economic phenomena, e.g., daily temperature, DNA sequences,
river run-off, earthquakes and stock markets, display long memory or long range correlation [10, 11].
Long memory implies a slowly decaying autocorrelation function of the power law type, such that the
system does not exhibit typical time scales. In this case, the intervals between extreme events are
likely to be correlated as well. By contrast, it is known that for an uncorrelated time series the
intervals between extreme events are also uncorrelated and are exponentially distributed. The
question is how the presence of long range correlation modifies the return interval distribution of
extreme events. A definite answer to this question would shed new light on many problems across
various disciplines.

Return interval distributions are interesting and useful for several reasons, the most important
being that many problems in diverse fields can be formulated in terms of return interval statistics.
For instance, the recurrence time intervals between earthquakes above a given magnitude [12], x-ray
solar flare recurrences [13], the statistics of acoustic emission from rock fractures [14],
inter-arrival packet times on computer and cellular networks [15] and the classical problem of
Poincaré recurrences in Hamiltonian systems [16] can all be formulated as extreme event questions
involving the return interval distribution. In a non-stationary time series, it is often difficult
to reliably estimate temporal statistical properties such as the autocorrelations or higher order
correlations. Thus, return interval distributions are also a useful tool to characterize the
temporal properties of such systems.

Let $x(t)$ denote a sequence of random variables, where $t$ is the time index. We call an event
extreme if $x(t) > q$, where $q$ is some threshold value. The return interval $r$ is the time
between successive occurrences of extreme events. With respect to the threshold $q$, we have a
well-defined series of return intervals, $r_k$, $k = 1, 2, 3, \ldots, N$. This is shown schematically
in Fig. 1. If the random variables $x(t)$ are uncorrelated, then the return intervals $r_k$ are also
uncorrelated and they are exponentially distributed as

$$P_q(r) = \frac{1}{\langle r \rangle} \, e^{-r/\langle r \rangle}. \tag{1}$$

For later use, we also define the average return interval, which depends on the threshold $q$, to be

$$\langle r \rangle = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} r_k. \tag{2}$$

FIG. 1: This schematic diagram shows the return intervals for a threshold value $q = 2$ as a
function of time $t$.
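The definitions of the return intervals $r_k$ and of their average in Eq. (2) translate directly into a few lines of code. The following is a minimal Python sketch, not the code used for this paper: the function names `return_intervals` and `mean_interval`, as well as the seed, sample size and threshold in the usage lines, are illustrative assumptions. For uncorrelated Gaussian data it compares the fraction of intervals longer than $\langle r \rangle$ with the value $e^{-1}$ implied by Eq. (1); since the time index is discrete, the agreement can only be approximate.

```python
import math
import random


def return_intervals(x, q):
    """Gaps, in time steps, between successive samples with x[t] > q."""
    event_times = [t for t, value in enumerate(x) if value > q]
    return [later - earlier for earlier, later in zip(event_times, event_times[1:])]


def mean_interval(intervals):
    """Sample estimate of the average return interval of Eq. (2)."""
    return sum(intervals) / len(intervals)


# Illustrative check on uncorrelated Gaussian data (seed, size and q are arbitrary choices).
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
r = return_intervals(x, q=1.5)
mean_r = mean_interval(r)
fraction_long = sum(1 for value in r if value > mean_r) / len(r)
print(f"<r> = {mean_r:.2f}")
print(f"fraction above <r>: {fraction_long:.3f}, Eq. (1) gives {math.exp(-1.0):.3f}")
```

The same two functions can be applied unchanged to a long range correlated series; only the resulting distribution of the intervals differs.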
In contrast to an uncorrelated time series, a long range correlated series has an autocorrelation
function that displays a power law of the form

$$C(\tau) = \langle x(t+\tau)\, x(t) \rangle \sim \tau^{-\gamma}, \qquad (0 < \gamma < 1), \tag{3}$$

where $\langle \cdot \rangle$ denotes the temporal average and $\gamma$ is the autocorrelation
exponent. The work done in the last few years shows that the long range correlation does indeed
affect the return interval distribution of extreme events

2, la la • Empirical results in a series of papers 

3, H, @] have shown that, in the presence of long range 
correlation, the return interval distribution becomes a stretched exponential given by

$$P_q(R) = A(\gamma) \, e^{-B(\gamma) R^{\gamma}}, \tag{4}$$

with the scaled return interval defined as $R = r/\langle r \rangle$. Both $A(\gamma)$ and
$B(\gamma)$ are constants that depend on $\gamma$. They can be fixed by normalizing both the
probability and the average return interval to unity. It has also been shown that the return
intervals themselves are long range correlated.

However, an analytical justification for the stretched exponential distribution in Eq. (4) is still
lacking, and the main contribution of this paper is to partly fill this void. In this context, it
must be noted that deviations from the stretched exponential distribution in Eq. (4) have been
noted for return intervals shorter than the average ($R < 1$). For these short return intervals,
empirical results display a power law with the exponent $\sim (\gamma - 1)$ Q, which is not
explained by Eq. (4). While the return interval distribution is expected to depend on the threshold
$q$, the stretched exponential form does not explicitly reveal this dependence. This paper addresses
these questions using a combination of analytical and numerical results. Firstly, from theoretical
arguments, we obtain an approximate expression for the return interval distribution, which modifies
Eq. (4) from a purely stretched exponential form to a product of a power law and a stretched
exponential. Secondly, we systematically study the dependence of the return interval distribution on
the threshold $q$ and show that our analytical result holds in the limit $q \gg 1$. In general, the
return interval distribution depends on the value of the threshold $q$.

Recently, in the study of global seismic activity above some magnitude $M$, the distribution
$F(\tau)$ of recurrence times $\tau$ has been shown to follow a scaling ansatz of the form

$$F(\tau) = \frac{1}{\bar{\tau}} \, f(\tau/\bar{\tau}), \tag{5}$$

where the function $f(\tau/\bar{\tau})$ is the gamma distribution [12]. In fact, this scaling
relation seems to hold for forest fire occurrence intervals [17], tsunami inter-event times [18] and
ion channel currents in voltage dependent anion channels in the cell [19]. The analytical
distribution obtained in this paper might shed light on this scaling found in a variety of systems.

In the next section, we obtain an analytical expression for the return interval distribution and in
the subsequent section we present our numerical results. Further, we systematically study the
dependence of the return interval distribution on the threshold used to define the extreme event.
Finally, we present our conclusions.



II. RETURN TIME DISTRIBUTION 

The starting point of our approach is to transform the given long range correlated time series
$x(t)$ with autocorrelation exponent $\gamma$ into a binary sequence with 1 at positions of extreme
events and 0 elsewhere. Thus, we obtain a new binary sequence defined by

$$y(t) = \begin{cases} 1, & \text{if } x(t) > q, \\ 0, & \text{if } x(t) \le q. \end{cases} \tag{6}$$

We use the empirical result that for a long range correlated time series with autocorrelation
exponent $\gamma$, the return intervals are also long range correlated with the same exponent. Thus,
our probability model is the statement that, given an extreme event at time $t = 0$, the probability
to find an extreme event at time $t = r$ is given by

$$P_{ex}(r) = a \, r^{-(2H-1)} = a \, r^{-(1-\gamma)}, \tag{7}$$

where $1/2 < H < 1$ is the Hurst exponent and $a$ is the normalization constant that will be fixed
later. We have also used the well-known relation between the Hurst exponent and the autocorrelation
exponent, $\gamma = 2 - 2H$. Equation (7) implies that after an extreme event it is highly probable
that the next event is an extreme one too, which is a reasonable proposition for a persistent time
series. Notice also that for an uncorrelated time series $H = 1/2$. This leads to $P_{ex}(r)$ in
Eq. (7) becoming independent of $r$, as would be expected for an uncorrelated time series. Further
support for this proposition comes from the theorem due to Newell and Rosenblatt [21, 22] obtained
in the context of zero crossing probabilities for Gaussian processes. It states that for a separable
Gaussian stationary process $X(t)$ with mean $\langle X \rangle = 0$, the probability $g(T)$ that
$X(t) > 0$ for $0 \le t \le T$ is $g(T) = O(T^{-\alpha})$ as $T \to \infty$, where $\alpha > 0$.

Next we calculate the probability that, given an extreme event at time $t = 0$, no extreme event
occurs in the interval $(0, r)$. For this, we divide the interval $r$ into $m$ sub-intervals indexed
by $j = 0, 1, 2, \ldots, (m-1)$ and we calculate this probability in each of the sub-intervals. For
the $j$th sub-interval, using Eq. (7), the probability of an extreme event is given by

$$h(j) = \frac{a r}{2m} \left[ \left( \frac{(j+1)\, r}{m} \right)^{-(1-\gamma)} + \left( \frac{j\, r}{m} \right)^{-(1-\gamma)} \right]. \tag{8}$$

After simplifying this expression, the probability that no extreme event occurs in the $j$th
sub-interval is given by

$$1 - h(j) = 1 - \frac{a r}{2m} \left( \frac{r}{m} \right)^{-(1-\gamma)} \left[ (j+1)^{-(1-\gamma)} + j^{-(1-\gamma)} \right]. \tag{9}$$

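At this point, we make an approximation and assume that the absence of an extreme event in each
sub-interval is an independent event. Then, the probability $P_{noex}(r)$ that no extreme event
occurs in any of the sub-intervals in $(0, r)$ is simply the product of probabilities,

$$P_{noex}(r) = \lim_{m \to \infty} \prod_{j=1}^{m-1} \left[ 1 - h(j) \right], \tag{10}$$

where the product starts at $j = 1$ because $P_{ex}(r)$ in Eq. (7) is singular at the lower end
point of the $j = 0$ sub-interval. The required probability $P(r)\,dr$ is simply the product of
$P_{noex}(r)$ with the probability $P_{ex}(r)\,dr$ that an extreme event takes place in the
infinitesimal interval $dr$ beyond $r$. This can be assembled together as

$$\begin{aligned}
P(r)\,dr &= P_{noex}(r)\, P_{ex}(r)\, dr \\
&= \lim_{m\to\infty} \left[1 - \phi_{m,r}\left(2^{-(1-\gamma)} + 1\right)\right]
\left[1 - \phi_{m,r}\left(3^{-(1-\gamma)} + 2^{-(1-\gamma)}\right)\right] \cdots \\
&\qquad \left[1 - \phi_{m,r}\left(m^{-(1-\gamma)} + (m-1)^{-(1-\gamma)}\right)\right] a\, r^{-(1-\gamma)}\, dr,
\end{aligned} \tag{11}$$

where

$$\phi_{m,r} = \frac{a}{2} \left( \frac{r}{m} \right)^{\gamma}. \tag{12}$$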



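The value of $m$ can be arbitrarily large, and Eq. (11) can be simplified and rewritten, to leading
order in $\phi_{m,r}$, as

$$P(r)\,dr = \lim_{m\to\infty} \exp\left[ -2\, \phi_{m,r}\, H_m^{(1-\gamma)} \right] a\, r^{-(1-\gamma)}\, dr, \tag{13}$$

where $H_m^{(1-\gamma)} = \sum_{k=1}^{m} k^{-(1-\gamma)}$ is the generalized harmonic number [23].
In order to take the limit $m \to \infty$, we note that

$$\lim_{m\to\infty} \frac{H_m^{(1-\gamma)}}{m^{\gamma}} = \frac{1}{\gamma}, \qquad (0 < \gamma < 1). \tag{14}$$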



Using Eq. (14) in Eq. (13) and taking the limit, we obtain the following result for the
distribution of return intervals:

$$P(r)\,dr = a\, r^{-(1-\gamma)}\, e^{-(a/\gamma)\, r^{\gamma}}\, dr. \tag{15}$$
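The limit leading from Eq. (11) to Eq. (15) can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original derivation: it evaluates the finite-$m$ product of Eq. (11) with the factors starting at $(2^{-(1-\gamma)} + 1)$, uses the definition $H_m^{(s)} = \sum_{k=1}^{m} k^{-s}$ for the generalized harmonic number, and the values of $\gamma$, $a$ and $r$ are arbitrary choices. The function names are hypothetical.

```python
import math


def generalized_harmonic(m, s):
    """H_m^(s) = sum_{k=1}^{m} k^(-s)."""
    return sum(k ** (-s) for k in range(1, m + 1))


def no_event_probability(r, a, gamma, m):
    """Finite-m product of Eq. (11), without the final factor a r^-(1-gamma).

    Assumes m is large enough that every factor of the product is positive.
    """
    phi = 0.5 * a * (r / m) ** gamma          # Eq. (12)
    s = 1.0 - gamma
    log_product = 0.0
    for j in range(1, m):
        log_product += math.log1p(-phi * ((j + 1) ** (-s) + j ** (-s)))
    return math.exp(log_product)


def limiting_form(r, a, gamma):
    """Exponential factor of Eq. (15)."""
    return math.exp(-(a / gamma) * r ** gamma)


gamma, a, r = 0.5, 0.3, 4.0                    # arbitrary illustrative values
for m in (100, 10_000, 100_000):
    ratio = generalized_harmonic(m, 1.0 - gamma) / m ** gamma   # tends to 1/gamma, Eq. (14)
    print(m, ratio, no_event_probability(r, a, gamma, m), limiting_form(r, a, gamma))
```

The convergence in $m$ is slow, with corrections that decay roughly as $m^{-\gamma}$, so large values of $m$ are needed when $\gamma$ is small.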



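The constant $a$ is fixed by normalization as follows: we demand that the total probability and the
average return interval $\langle r \rangle$ be normalized to unity,

$$\int_0^{\infty} P(r)\, dr = 1 \tag{16}$$

and

$$\langle r \rangle = \int_0^{\infty} r\, P(r)\, dr = 1. \tag{17}$$

However, the distribution in Eq. (15) is already normalized and hence Eq. (17) will be used to
determine the value of $a$. The requirement that $\langle r \rangle = 1$ is equivalent to expressing
the return intervals $r$ in units of $\langle r \rangle$. Performing the integrals above, the
normalized distribution in the variable $R = r/\langle r \rangle$ turns out to be

$$P(R) = \gamma \left[ \Gamma\!\left( \frac{1+\gamma}{\gamma} \right) \right]^{\gamma} R^{-(1-\gamma)}\, \exp\left( - \left[ \Gamma\!\left( \frac{1+\gamma}{\gamma} \right) \right]^{\gamma} R^{\gamma} \right), \tag{18}$$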



where $\Gamma(\cdot)$ is the Gamma function. First, we discuss some of the salient features of this
distribution. The case $\gamma = 1$ defines the crossover to short range correlated or uncorrelated
time series. If we put $\gamma = 1$ in the distribution in Eq. (18) above, we recover the
exponential distribution, $P(R) = \exp(-R)$. In the region $R \ll 1$, i.e., for return intervals
much below the average, the dominant behavior can be seen by taking the logarithm on both sides of
Eq. (18), leading to

$$\log P(R) = \log(\gamma\, g_\gamma) - (1-\gamma) \log R - g_\gamma R^{\gamma}, \tag{19}$$

where we have used $g_\gamma = \left[ \Gamma\!\left( \frac{1+\gamma}{\gamma} \right) \right]^{\gamma}$.
For $R \ll 1$, the second term dominates the distribution and thus we obtain a power law with
exponent $(\gamma - 1)$:

$$P(R) \propto R^{-(1-\gamma)} \qquad (R \ll 1). \tag{20}$$

This power law behavior with exponent $(\gamma - 1)$ for short return intervals has already been
noted in the numerical results presented in Ref. 0-. Thus, our approach analytically shows the
emergence of a power law regime for short $R$, in contrast to the stretched exponential
distribution. On the other hand, for $R \gg 1$, the logarithmic term in Eq. (19) can be dropped and
the return interval distribution behaves essentially like a stretched exponential distribution,

$$P(R) \propto e^{-g_\gamma R^{\gamma}} \qquad (R \gg 1). \tag{21}$$

Thus, the stretched exponential is a good approximation for $R \gg 1$. This partly explains why a
pure stretched exponential distribution as in Eq. (4) deviates, for $R < 1$, from the simulated
return interval distributions in the earlier works [5, 6, 7, 8, 9]. Finally, we also note that
Eq. (15) can also be derived by other methods without actually discretising the interval $r$ as we
have done.

FIG. 2: (Color Online) The simulated return interval distribution (circles) and theoretical
distribution in Eq. (22) (solid lines) for long range correlated time series. The threshold is
$q = 3.0$ with average return interval $\langle r \rangle = 743.0$ for all the cases shown above.
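The properties of the scaled distribution in Eq. (18) discussed above, namely unit normalization, unit mean, the exponential limit at $\gamma = 1$ and the crossover of the local log-log slope from $(\gamma - 1)$ to a steeper decay, can be verified with a short numerical sketch. The code below is illustrative only; the function names and the quadrature settings (`n`, `u_max`) are assumptions made here, not quantities from the paper.

```python
import math


def g(gamma):
    """g_gamma = [Gamma((1 + gamma) / gamma)]^gamma, as used in Eq. (19)."""
    return math.gamma((1.0 + gamma) / gamma) ** gamma


def pdf(R, gamma):
    """Scaled return interval density of Eq. (18)."""
    gg = g(gamma)
    return gamma * gg * R ** (gamma - 1.0) * math.exp(-gg * R ** gamma)


def local_slope(R, gamma):
    """d log P / d log R, obtained by differentiating Eq. (19)."""
    return -(1.0 - gamma) - gamma * g(gamma) * R ** gamma


def moment(order, gamma, n=50_000, u_max=60.0):
    """Integral of R^order P(R) dR over R > 0, by the midpoint rule.

    The substitution u = g_gamma R^gamma turns P(R) dR into exp(-u) du and
    removes the integrable singularity at R = 0.
    """
    gg = g(gamma)
    h = u_max / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += (u / gg) ** (order / gamma) * math.exp(-u)
    return total * h


for gamma in (0.1, 0.5, 0.9):
    print(gamma, moment(0, gamma), moment(1, gamma),
          local_slope(1e-3, gamma), local_slope(10.0, gamma))
```

For small $R$ the printed slope approaches $(\gamma - 1)$, as in Eq. (20), while for large $R$ the stretched exponential term takes over.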

As shown above, the return interval distribution in Eq. (15) does reproduce the empirical results
already known in the literature but is nevertheless approximate in the following sense. It is known
that there exist correlations among the return intervals and they are particularly strong as
$\gamma \to 0$. Thus, every return interval depends on the value of the previous return interval.
This is also well documented in the literature as the conditional probability $P(R|R_0)$ to find a
return interval $R$, given that the previous return interval was $R_0$ @, H, Q. This conditional
probability shows interesting features and deviates from the case of uncorrelated return intervals.
Equation (15) does not take into account these correlations among intervals and in fact is derived
on the assumption that return intervals are independent. This is a gross approximation, though in
the absence of any other definitive model for the correlations among intervals this is a simple and
analytically tractable choice. Based on this argument, one can expect Eq. (15) to describe the
return interval statistics in the regime where the correlations are not highly dominant, for
$\langle r \rangle \gg 1$ [24]. Secondly, note that even though the threshold $q$ plays a crucial
role, as we will describe in the next section, it does not play any role in Eq. (15). The threshold
$q$ is related to $\langle r \rangle$ such that the higher the value of $q$, the larger is
$\langle r \rangle$, though the relation is not linear. Thus, the theoretical arguments leading to
Eq. (15) would best describe the asymptotic limit $q \gg 1$ or $\langle r \rangle \gg 1$.

Using Eq. (18) in practice can lead to a strong divergence for $r \to 0$. From a physical
standpoint, this problem can be understood from the fact that there cannot be zero return
intervals, though they can be arbitrarily small. By definition $r > 0$, and if $r_{min}$ is the
shortest return interval then its corresponding scaled version would be $r_{min}/\langle r \rangle$.
If the original signal is sampled at equal time intervals, $r_{min}$ can be scaled to unity and the
shortest scaled return interval would be $1/\langle r \rangle$. Eq. (18) should therefore be
modified by replacing the lower limit in the integrals in Eqs. (16) and (17) by
$1/\langle r \rangle$ instead of 0. This also reflects the general idea that all power laws in
practice have a lower bound, and a return interval distribution like Eq. (18), which displays a
power law type regime, will necessarily have a lower cut-off.

We will go back to Eq. (15) and rewrite the return interval distribution as

$$f(r) = B\, r^{-(1-\gamma)}\, e^{-(A/\gamma)\, r^{\gamma}}, \tag{22}$$

where $A$ and $B$ are constants that now depend on both $\gamma$ and the average return interval.
As usual, both constants are fixed by demanding that the probability and the average return
interval normalize to unity. This leads to the following set of integrals:

$$\int_{s_0}^{\infty} f(r)\, dr = \frac{B}{A}\, e^{-p} = 1, \tag{23}$$

$$\int_{s_0}^{\infty} r\, f(r)\, dr = \frac{B s_0}{A} \left[ e^{-p} + \frac{\Gamma(1/\gamma, p)}{\gamma\, p^{1/\gamma}} \right] = 1, \tag{24}$$

where $s_0 = 1/\langle r \rangle$, $p = A s_0^{\gamma}/\gamma$ and $\Gamma(\cdot,\cdot)$ is the
incomplete Gamma function [25]. The algebraic equations to be solved for $A$ and $B$ are
transcendental in nature and a closed form solution does not seem possible except for some special
values. By further manipulation of Eqs. (23) and (24) we obtain

$$\frac{1}{s_0} = 1 + \frac{e^{p}\, \Gamma(1/\gamma, p)}{\gamma\, p^{1/\gamma}}. \tag{25}$$

If $p = p_0$ is the solution of Eq. (25) for a definite $\langle r \rangle$, then the constants can
be obtained as

$$A = \frac{\gamma\, p_0}{s_0^{\gamma}}, \qquad B = A\, e^{p_0}. \tag{26}$$

In the simulations shown in this paper, we have numerically solved for the constants $A$ and $B$ in
Eq. (22) for various values of $\langle r \rangle$ using Eqs. (25) and (26).
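One possible way of carrying out this numerical solution is sketched below. It is not the procedure used for the paper, which is not described in the text, and it rests on several assumptions made here: the relation $\langle r \rangle = 1 + e^{p}\, \Gamma(1/\gamma, p) / (\gamma\, p^{1/\gamma})$ with $p = A s_0^{\gamma}/\gamma$ and $s_0 = 1/\langle r \rangle$, a right-hand side that decreases monotonically with $p$ so that bisection applies, and Simpson's rule for the incomplete Gamma function to stay within the Python standard library. All function names and numerical settings are hypothetical.

```python
import math


def scaled_upper_gamma(s, p, u_max=100.0, n=4000):
    """e^p * Gamma(s, p) = integral over u >= 0 of (p + u)^(s-1) e^(-u), by Simpson's rule."""
    h = u_max / n                               # n must be even

    def integrand(u):
        return (p + u) ** (s - 1.0) * math.exp(-u)

    total = integrand(0.0) + integrand(u_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def mean_interval_from_p(p, gamma):
    """Average return interval <r> implied by a given p (assumed relation, see lead-in)."""
    s = 1.0 / gamma
    return 1.0 + scaled_upper_gamma(s, p) / (gamma * p ** s)


def solve_constants(mean_r, gamma, lo=1e-9, hi=50.0, iterations=60):
    """Bisection in log(p) for p0, then A = gamma p0 <r>^gamma and B = A e^p0."""
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        if mean_interval_from_p(mid, gamma) > mean_r:
            lo = mid                            # p too small: implied <r> is too large
        else:
            hi = mid
    p0 = math.sqrt(lo * hi)
    A = gamma * p0 * mean_r ** gamma
    B = A * math.exp(p0)
    return p0, A, B


p0, A, B = solve_constants(743.0, 0.5)          # illustrative values of <r> and gamma
print(p0, A, B)
```

For $\gamma = 1/2$ the incomplete Gamma function is elementary, $e^{p}\, \Gamma(2, p) = p + 1$, which gives a closed-form check of the solver. For non-integer $1/\gamma$ the simple quadrature used here is less accurate near $u = 0$, and a library routine for the incomplete Gamma function would be preferable.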



III. NUMERICAL RESULTS 

In this section, we display the numerical results for the return interval distribution of long
range correlated time series drawn from a Gaussian distribution with zero mean and unit variance.
The long range correlated data was generated using the Fourier filtering technique [26]. We
generate $2^{25} \approx 3 \times 10^7$ data points for each value of $\gamma$ and then compute
their return interval distribution. The numerical results are displayed in Fig. 2 as a log-log plot
for $q = 3$ along with the theoretical distributions given in Eqs. (18) and (22). The agreement with
the theoretical distribution is good and, as expected, gets better as $\gamma \to 1$. Similarly good
agreement is also obtained for the values of $\gamma$ not shown here. The simulated results in
Fig. 2 do not cover a larger range in $\log R$ because of the large value of the threshold $q$
chosen, corresponding to an average return interval of $\langle r \rangle = 743.0$. To overcome this
problem, we would need extremely long sequences of random time series. As we have argued in the
previous section, the theoretical distribution can be expected to agree with the data when the
threshold $q$, or equivalently the average return interval, is large. Thus, as we reduce $q$ below
2.5, there are deviations from the theoretical distribution, which are systematically studied in the
next section.
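The text does not spell out the Fourier filtering procedure, so the following is only a sketch of one common variant, under assumptions stated here: white Gaussian noise is transformed to frequency space, its amplitudes are multiplied by $f^{-(1-\gamma)/2}$ so that the power spectrum behaves as $f^{-(1-\gamma)}$, which corresponds to an autocorrelation decaying as $\tau^{-\gamma}$, and the result is transformed back and standardized. A naive $O(n^2)$ discrete Fourier transform keeps the example within the Python standard library; for series as long as those used in this section an FFT routine would be required. The function names are hypothetical.

```python
import cmath
import math
import random


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    twiddle = [cmath.exp(sign * 2j * math.pi * k / n) for k in range(n)]
    out = [sum(values[t] * twiddle[(k * t) % n] for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def long_range_correlated(n, gamma, rng):
    """Gaussian series with power spectrum ~ f^-(1-gamma), zero mean and unit variance."""
    beta = 1.0 - gamma
    spectrum = dft([rng.gauss(0.0, 1.0) for _ in range(n)])
    filtered = [0j]                              # drop the zero-frequency component
    for k in range(1, n):
        f = min(k, n - k) / n                    # symmetric filter keeps the series real
        filtered.append(spectrum[k] * f ** (-beta / 2.0))
    series = [v.real for v in dft(filtered, inverse=True)]
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return [(v - mean) / std for v in series]


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / n
    cov = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)) / n
    return cov / var


x = long_range_correlated(256, 0.3, random.Random(1))   # small n, arbitrary seed
print([round(autocorrelation(x, lag), 3) for lag in (1, 2, 4, 8)])
```

Because the construction is periodic in time, the autocorrelation of such a finite series follows the power law only for lags much shorter than the series length.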

In Fig. 3 we show the power law regime indicated by Eq. (20). In this figure, we focus on the region
$R \ll 1$ where we expect the power law to appear. For each value of $\gamma$ in Fig. 3 we have
drawn a straight line (shown in red) with the slope $(-1 + \gamma)$. Quite clearly, the numerical
data show good agreement with the theoretical slope. As $\gamma \to 0$, the power law regime holds
over a larger range of $R$; for instance, see the cases of $\gamma = 0.1$ and 0.3. On the other
hand, as seen in the case of $\gamma = 0.7$, the power law region becomes shorter and the stretched
exponential regime begins to dominate as $\gamma \to 1$. This is an indication that the return
interval distribution makes a transition from predominantly (stretched) exponential behavior to a
predominantly power law type curve as $\gamma \to 0$. It must be pointed out that the agreement with
the theoretically expected slope $(-1 + \gamma)$ is reached only for $q \gg 1$. This is to be
expected since the derived distributions in Eqs. (18) and (22) do not take into account the
correlations among the return intervals. In the next section, we study how the slope in the power
law regime changes with the threshold $q$ in the numerically simulated long range correlated data.



IV. RETURN INTERVAL DISTRIBUTION AND 
THRESHOLD FOR EXTREME EVENTS 

In this section, we empirically examine the relation between the return interval distribution,
especially in the power law regime, and the threshold $q$ that defines the extreme events.
Intuitively, we can expect that if the threshold is higher, extreme events will be fewer and hence
the return intervals will be longer. Thus, larger $q$ leads to larger average return intervals. Here
we address the question of how the return interval distributions in Eqs. (18) and (22) are modified
by changes in the threshold value $q$. One clear indication is that, approximately for $q < 2$, the
simulated return interval distributions deviate systematically from Eqs. (18) and (22), in
particular for $R < 1$. To study this, we plot the return interval distribution for the simulated
data in a log-log plot as shown in Fig. 2 and measure the slope in a linear region for $R < 1$ for
various values of $q$. The result is displayed in Fig. 4 for $\gamma = 0.1$. It is seen that as $q$
increases, the initial part of the distribution, i.e., $R < 1$ or $\log R < 0$, is closer to being a
straight line with slope $(-1 + \gamma)$. A similar behavior is seen for all the values of $\gamma$
of interest.

FIG. 3: (Color Online) The return interval distribution focused on the power law regime. The
numerical distribution (circles) is nearly a straight line with the slope $(-1 + \gamma)$. A
straight line with slope $(-1 + \gamma)$ is shown as a solid (red) line for comparison. For all the
cases, $q > 3.0$ corresponding to $\langle r \rangle > 740.0$.

FIG. 4: (Color Online) The return interval distribution for the simulated data (circles) for
$\gamma = 0.1$ plotted for various values of the threshold $q$. A straight line with slope
$(-1 + \gamma)$ is shown as a solid (blue) line for comparison. Note that as $q$ increases, the
initial part of the distribution moves closer to a slope of $(-1 + \gamma)$.

In order to see this variation of the slope of the initial part of the distribution with $q$, we
plot in Fig. 5(a) the measured slope $s_m(q)$ against the threshold $q$ for various values of
$\gamma$. The slope is measured in the linear region of the log-log plot for $R \ll 1$. For a given
value of $\gamma = \gamma_c$, the slope increases monotonically to reach a saturation value of
$(-1 + \gamma_c)$ as $q \to \infty$. Once again we point out that this is in agreement with our
expectation that the weakly correlated regime would agree with the distribution obtained in
Eqs. (18) and (22). For the Gaussian distributed data that we use, at $q = 3$, the average return
interval is $\langle r \rangle \approx 744.0$. Beyond $q = 3$, with $2^{25}$ data points the number
of return intervals is not sufficient for reliable statistics. All this would imply that in order to
take into account the effect of $q$, the power law proposed in Eq. (20) could be modified as

$$P(R) \propto R^{-(1-\gamma)\, \theta(q,\gamma)} \tag{27}$$

with the restriction, suggested by the numerical results in Fig. 5(a), that
$\theta(q,\gamma) \to 1$ as $q \to \infty$. Clearly, the measured slope is simply given by
$s_m = -(1-\gamma)\,\theta(q,\gamma)$. Thus, we can directly visualize the function
$\theta(q,\gamma)$ if we plot $s_m(q)/(\gamma - 1)$ as a function of $q$. This is shown in
Fig. 5(b). As we anticipated, the function $\theta(q,\gamma)$ tends towards unity as
$q \to \infty$. The autocorrelation exponent $\gamma$ controls the rate at which the limiting value
of unity is reached. We believe that the behavior displayed in Fig. 5(a, b) is related to a more
fundamental question of how the autocorrelation exponent of a long range correlated time series
changes if it is subjected to thresholding such as the one we have applied using Eq. (6).
Obviously, every time we choose a subset of events from a larger set, such as the extreme events,
such thresholding is implicitly applied. Since the power law regime varies with $q$, the stretched
exponential part must also be modified if the distribution is to remain normalized. However, this
might be difficult to visualize numerically. The central aim of this section is to show that
Eqs. (18) and (22) represent return interval distributions in the limit when the threshold or
average return interval is large. We have shown through simulations the dependence of return
interval distributions on the threshold $q$. This explains why we have chosen $q = 3$ to illustrate
our result in Fig. 2. Thus, in principle, the exact return interval distribution should depend on
$\langle r \rangle$, especially for short return intervals, i.e., $R < 1$.
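The slope measurement described in this section can be sketched as follows. The binning and fitting choices are not given in the text, so everything here is an assumption for illustration: logarithmically spaced bins, an unweighted least-squares fit of $\log P$ against $\log R$ over a chosen range, and the ratio $s_m/(\gamma - 1)$ as the estimate of the function in Eq. (27). The function names are hypothetical, and the usage lines test the procedure on synthetic intervals drawn directly from Eq. (18) rather than on thresholded data.

```python
import math
import random


def log_binned_density(samples, lo, hi, n_bins):
    """Density estimate on logarithmically spaced bins; returns (bin centre, density) pairs."""
    edges = [lo * (hi / lo) ** (i / n_bins) for i in range(n_bins + 1)]
    counts = [0] * n_bins
    for value in samples:
        if lo <= value < hi:
            i = int(n_bins * math.log(value / lo) / math.log(hi / lo))
            counts[min(i, n_bins - 1)] += 1
    points = []
    for i, count in enumerate(counts):
        if count > 0:
            width = edges[i + 1] - edges[i]
            centre = math.sqrt(edges[i] * edges[i + 1])
            points.append((centre, count / (len(samples) * width)))
    return points


def fitted_slope(points):
    """Least-squares slope of log(density) against log(R)."""
    xs = [math.log(r) for r, _ in points]
    ys = [math.log(d) for _, d in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def theta_estimate(slope, gamma):
    """Ratio s_m / (gamma - 1), which tends to 1 when the slope equals (gamma - 1)."""
    return slope / (gamma - 1.0)


# Synthetic scaled intervals drawn from Eq. (18) by inverse transform sampling.
gamma = 0.5
g_gamma = math.gamma((1.0 + gamma) / gamma) ** gamma
rng = random.Random(3)
R = [(rng.expovariate(1.0) / g_gamma) ** (1.0 / gamma) for _ in range(100_000)]
slope = fitted_slope(log_binned_density(R, 1e-4, 1e-2, 6))
print(slope, theta_estimate(slope, gamma))
```

With real thresholded data the same two steps would be repeated for each threshold $q$ to trace out the dependence of the measured slope on $q$.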



V. LONG RANGE PROBABILITY PROCESS 

Apart from corrections arising due to the dependence on $q$, the return interval distribution
derived in this paper suffers from the approximation arising from the assumed independence of
return intervals. This assumption makes the analysis tractable but does not reflect reality, since
we know that the intervals are indeed correlated. In this section, we argue that the deviations
from the numerical simulations evident in Fig. 2 can be attributed to the presence of correlations
in the return interval data. We do this by simulating the probability process in Eq. (7) that forms
the basis for the analytical result in Eqs. (18) and (22). If the simulated data agrees with the
analytical result, then we could attribute the deviations seen in Fig. 2 to the correlations present
in the return intervals.

In order to numerically simulate the probability process in Eq. (7), we first determine the constant
$a$ by normalizing it in the region between $k_{min} = 1$ and $k_{max}$. The normalized probability
distribution corresponding to Eq. (7) is

$$P(k) = \frac{\gamma}{k_{max}^{\gamma} - 1}\, k^{-(1-\gamma)}, \tag{28}$$

where $k = 1, 2, 3, \ldots$. We generate a random number $\xi_k$ from a uniform distribution at
every $k$ and compare it with the value of $P(k)$. A random number is accepted as an extreme event
if $\xi_k < P(k)$ at any given value of $k$. If $\xi_k \ge P(k)$, then it is not an extreme event.
By this procedure, we generate a series of extreme events following Eq. (7). We then compute the
return intervals and their distribution after scaling by the average return interval.

In Fig. 6 we show the return interval distribution obtained by simulating our probability process
along with the distribution given by Eq. (22). The agreement with the theoretical distribution is
excellent, including for the values of $\gamma$ not shown here. Hence, if the long range correlated
data had independent return intervals, then we would have obtained nearly perfect agreement with
Eqs. (18) and (22). This implies that the remaining disagreement between the theoretical and
numerical results seen in Fig. 2 can be attributed to the presence of correlations among the return
intervals. On the other hand, if the probability process in Eq. (7) were an incorrect assumption, it
may not have been possible to obtain the results displayed in Fig. 6.

FIG. 5: (Color Online) (a) The measured slope $s_m$ in the power law regime as a function of $q$ for
$\gamma = 0.1$ (circles), 0.3 (squares), 0.5 (triangles) and 0.7 (plus). (b) The function
$\theta(q,\gamma) = s_m/(\gamma - 1)$ as a function of $q$ for the same values of $\gamma$ as in (a).

FIG. 6: (Color Online) The simulated return interval distribution (circles) from the probability
process in Eq. (7) compared with the theoretical distribution (solid line) given in Eq. (22).
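The acceptance procedure built on Eq. (28) can be sketched in a few lines of Python. Two details are assumptions made here because the text does not state them explicitly: the counter $k$ is restarted after every accepted event, in line with the conditioning "given an extreme event at time $t = 0$" in Eq. (7), and the formula for $P(k)$ continues to be used when $k$ exceeds $k_{max}$. The function names, seed and parameter values are illustrative only.

```python
import random


def event_probability(k, gamma, k_max):
    """P(k) = gamma / (k_max^gamma - 1) * k^-(1-gamma), with k counted from the last event."""
    return gamma / (k_max ** gamma - 1.0) * k ** (gamma - 1.0)


def simulate_return_intervals(n_intervals, gamma, k_max, rng):
    """Generate return intervals by comparing a uniform random number with P(k) at every step.

    Assumption: k restarts at 1 after each accepted event.
    """
    intervals = []
    k = 1
    while len(intervals) < n_intervals:
        if rng.random() < event_probability(k, gamma, k_max):
            intervals.append(k)      # extreme event: record the elapsed time
            k = 1
        else:
            k += 1                   # no extreme event at this step
    return intervals


rng = random.Random(42)
r = simulate_return_intervals(2000, gamma=0.5, k_max=50, rng=rng)
mean_r = sum(r) / len(r)
R = [value / mean_r for value in r]              # scaled return intervals
print(mean_r, sum(1 for value in R if value < 1.0) / len(R))
```

The scaled intervals `R` can then be histogrammed and compared with the theoretical distribution; by construction they are independent of one another, which is the property this section relies on.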



VI. DISCUSSIONS AND CONCLUSIONS 

We have studied the distribution of return intervals for extreme events in long range correlated
time series. An approximate analytical expression for this distribution has been obtained starting
from the empirically established fact that return intervals are long range correlated. This
distribution is a product of a power law and a stretched exponential and explains the observed power
law for short return intervals. For large return intervals, the distribution is dominated by a
stretched exponential decay. Earlier works have empirically proposed a stretched exponential form
for the return interval distribution, which is now shown to be valid in the domain of large return
intervals. Further, we have also carefully studied the role played by the threshold $q$, or
equivalently the average return interval, in the return time statistics. We show that it modifies
the return interval distribution, especially in the power law regime of short return intervals. We
believe that the results obtained in this paper explain most of the empirically observed features
in the return time distributions of long range correlated time series. In the simulations reported
in this work, we have used Gaussian distributed random numbers. As studied in Ref. 0, it is natural
to ask if exponential or power law distributed data would modify the results of this paper. We
expect that the functional form of the distribution in Eq. (22) would not be modified, though the
normalization constants $A$ and $B$ might change due to their dependence on the threshold $q$. The
verification of the results of this paper with a measured time series is underway and will be
reported elsewhere.

As pointed out before, the inter-event time distribution has applications across many disciplines.
Hence, it appears in different settings in different areas. In the statistical literature, a related
problem of zero crossings, i.e., the probability that $X(t) > 0$ for $0 \le t \le T$, has been
considered. Under certain conditions, for a stationary Gaussian process, the upper bound for the
zero crossing probability is shown to be a stretched exponential [21]. This result does not strictly
apply to the case of recurrence interval statistics because the zero crossing probability does not
make statements about the occurrence or non-occurrence of another zero crossing after the interval
$T$. A return interval, by definition, requires two crossings separated by an interval with no
crossings. Finally, we would like to remark that the analytical distribution obtained in this paper
appears to be related to the universal scaling form proposed recently [12] in the context of
earthquakes but appears to be more generally valid. Thus it is likely that the exact return interval
distribution might incorporate corrections to the one obtained in this paper. Indeed, if the exact
distribution is known, it will also become possible to determine the precise time scales over which
power law and exponential decay operate. This, in turn, should help address questions of hazard
estimation for extreme events more carefully and, needless to say, this is of enormous interest to
the insurance industry [2] and as a tool for decision support systems [27].